1.  Introduction


In critical and life-threatening situations faced by Soldiers on the battlefield, timely
response  to  complex  information  is  required.  In  such  situations,  battlefield 
computation  can  help  to  distill  data  into  actionable  information  that  can  lead  to 
better  decision-making  and  outcomes.  However,  computing  power  has  been 
historically  limited  on  the  tactical  edge  due  to  size,  weight,  and  power  constraints 
of  mobile  devices.  One  potential  solution  to  this  problem  is  to  offload  the 
computation  to  a  more  powerful  computer  to  obtain  an  answer  as  fast  as  possible. 
Computation  offloading  provides  an  alternative  strategy  in  which  the  user’s 
computational  job  is  communicated  to  another  computing  device  with  greater 
processing  power  that  processes  the  data  and  communicates  the  result  back  to  the 
user.  The  potential  advantages  of  computation  offloading  include  decreased 
computation  time,  decreased  battery  consumption  due  to  computation,  and  possibly 
increased  security  and  resiliency.  However,  these  features  come  at  the  cost  of  the 
introduction  of  communication  latency  and  increased  battery  consumption  due  to 
communication.  It  is  therefore  important  to  design  the  computation  mechanism 
intelligently  so  as  to  maximize  the  advantages  and  minimize  the  disadvantages. 

Another  important  aspect  to  consider  is  the  continually  evolving  nature  of 
computers  and  processing  power.  The  power  available  today  on  most  handheld 
devices  is  equivalent  to  the  power  available  on  a  desktop  only  a  few  years  ago.  It  is 
important  to  consider  how  far  into  the  future  computing  on  various  devices  will  be 
possible when considering long-term planning. In this report, we present the
development  of  a  computation  offloading  model  used  to  determine  the  best  strategy 
to  minimize  total  response  time  now  and  in  the  future. 

In  Section  2,  we  discuss  related  and  previous  work  in  this  area.  In  Section  3,  we 
detail  our  motivation  for  this  work  and  explain  why  our  high-level  approach  to 
modeling  is  an  important  step  in  the  design  of  an  offloading  system.  In  Section  4, 
we  describe  the  objectives  of  our  model  and  the  process  we  used  to  develop  it. 
Section  5  details  the  types  of  analysis  that  are  possible  using  a  simple  instantiation 
of  the  model  we  have  described.  Finally,  in  Section  6  we  present  some  conclusions 
based  on  initial  results  and  explore  potential  future  work. 

2.  Related  Work 


As  mobile  technology  has  matured  and  increased  in  popularity,  there  have  been 
many  efforts  to  reduce  energy  requirements  while  increasing  the  computational 
power  of  the  devices.  The  concept  of  offloading  computations  to  nearby  or  even 
distant resources has been widely explored. Previous studies1,2 have shown that
offloading certain portions of the code can increase overall performance, both
in terms of increased computation power and decreased energy consumption. This
shows that offloading is feasible in various situations; however, this work does not
concentrate on the time-sensitive analysis that is needed on the modern battlefield.

With  computation  offloading  comes  an  overhead  cost  of  transferring  data  to 
the  remote  device  and  transferring  the  answer  back.  Therefore,  the  benefits  of 
computing  on  other  devices  must  outweigh  the  data  transfer  time.  The  total 
cost  (transfer  time  plus  computation  time)  has  been  explored.3  There  have  been 
studies4  that  concentrated  on  minimizing  energy  consumption  as  the  main 
objective for offloading. This is important to consider in a time-critical
scenario,  but  it  cannot  be  the  sole  consideration.  Obtaining  the  computational 
results  as  quickly  as  possible  while  maintaining  lower  energy  consumption  is 
often  a  useful  strategy. 

Another idea that has been explored is to provide a software-based framework
that will offload portions of the application to any available hardware.5,6,7
Each of these provides a framework or software interface that will offload
various modules within the application to the cloud without the programmer’s
explicit specification. This differs from the previously discussed work
because the software-based framework makes the offloading decisions.
However,  this  strategy  does  not  predict  the  potential  performance  gained  and 
there  is  no  guarantee  of  the  performance  that  will  be  achieved. 

Delegation  is  another  computation  strategy  used  by  mobile  application 
developers  to  decrease  computation  time.  In  another  study,8  the  authors 
compare  offloading  to  delegation  to  determine  the  benefits  and  shortcomings 
of  each  in  different  scenarios  with  current  technology.  It  was  determined  that 
offloading  is  more  feasible  with  current  mobile  technology,  which  is  vital  for 
success  on  the  battlefield. 

The  use  of  a  cloudlet  (or  collection  of  mobile  devices  with  cloud-like  services) 
for  computation  offloading  has  been  studied9,10  and  shown  to  reduce  energy 
costs  while  maintaining  an  acceptable  computation  rate.  Under  this  strategy, 
high-powered  computers  can  be  geographically  farther  apart  and  reserved 
exclusively  for  very  computationally  intensive  applications.  This  reduces 
energy  costs  overall;  however,  the  acquisition  and  deployment  of  additional 
physical  hardware  may  be  necessary. 

3.  Motivation


The  purpose  of  this  work  is  to  develop  a  set  of  models  that  we  will  use  to  evaluate 
and  compare  alternative  computation  strategies  at  a  conceptual  level.  These 
models  have  several  purposes.  First,  they  motivate  and  quantify  the  need  for 
computation  offloading,  particularly  in  tactical  environments.  They  also  allow 
rapid  evaluation  of  alternative  computation  strategies  under  a  variety  of  conditions. 
Our models can also be applied to time-critical applications in nonmilitary
settings,  such  as  emergency  response. 

For  all  of  the  computation  strategies  considered  in  this  work,  we  are  interested  in 
analyzing the total response time, r, of the system, which we define as the time it
takes  to  provide  a  response  to  the  user.  Total  response  time  can  be  computed  as 
the  sum  of  the  communication  time  and  the  computation  time.  We  define 
communication  time  as  the  time  that  it  takes  to  transfer  the  required  data  to  the 
device  that  performs  the  computation  and  to  transfer  the  resultant  data  back  to  the 
user.  We  define  computation  time  as  simply  the  time  required  for  the  target  device 
to  process  the  data. 

In any setting, performing computation solely on the user’s handheld device will
require no communication time, but relatively high computation time. In a
commercial setting (with high-bandwidth network infrastructure), data can be sent
very quickly to a remote data center with very high computational capacity, so
computation  time  can  be  reduced  dramatically  at  the  cost  of  only  a  relatively 
small  increase  in  communication  time.  While  this  communication  latency  might 
be  intolerable  for  real-time  interactive  applications  (e.g.,  augmented  reality),  it  is 
acceptable  for  many  common  applications  (e.g.,  route  planning).  As  long  as  total 
response  time  is  acceptable,  monetary  cost  minimization  is  the  dominant  factor  in 
designing  a  commercial  computation  offloading  system. 

In  a  military  setting,  the  network  infrastructure  is  often  disrupted,  intermittent, 
and  low-bandwidth.  This  greatly  increases  communication  time,  which  opens  a 
potential  technology  gap.  It  might  be  possible  to  fill  this  gap  by  deploying 
strategic (regional), tactical (local),11 or mobile (1-hop) High-Performance
Computers (HPCs) within the network, providing new offloading targets between
the centralized data center and the user’s handheld device. We illustrate this gap
visually in Fig. 1.

As  shown  in  Fig.  1,  as  the  distance  between  the  user’s  handheld  device  and  the 
device  that  performs  the  computation  increases  (in  terms  of  number  of  network 
hops),  there  are  2  important  changes.  First,  the  communication  time  increases 
because  the  data  must  be  sent  farther  through  the  network.  Second,  the 
computation time decreases because the data can reach a more powerful computer. Based
on  these  ideas,  in  this  work  we  develop  a  model  that  combines  existing  models  of  computer 
and  network  performance  along  with  reasonable  system  architecture  assumptions  and 
parameter  estimates. 


Fig.  1  Conceptual  sketch  of  various  regions  in  the  space  of  computation  strategies

4.  Computation  Offloading  Model 

When designing our model, our 2 major goals were simplicity and modularity. We did not
intend the model to answer every question for every scenario, but rather to expose
easy-to-adjust values that allow designers to model their situation of interest. By designing
the  model  modularly,  we  minimize  the  interactions  between  the  various  components, 
localizing  the  effects  of  a  change  to  any  one  component. 

Our  model  describes  the  relationships  between  many  components,  including  the 
computational  job,  the  network,  the  available  compute  devices,  the  computation  strategy, 
utility  to  the  user,  and  the  development  of  technology  over  time.  The  Computational  Job 
characterizes  the  computation  that  the  user  wishes  to  perform.  A  set  of  Compute  Devices 
describes  computers  that  are  potentially  available  to  execute  the  Computational  Job.  The 
Network  describes  the  communication  network  that  can  transfer  data  between  the  user’s 
handheld  device  and  other  Compute  Devices. 

4.1  Computational  Job 

We describe a computational job with several parameters: the number of operations
necessary to execute the job; the deadline, d, by which the user needs a response; the maximum
utility, u_max, that is achieved for providing a response to the user before the deadline;
the size of the required data to transfer; and the fraction of operations that can be executed
in parallel.

4.2  Utility  Function 

The  system  response  time  is  sent  to  the  Utility  Function,  which  calculates  the  utility  to 
the  end  user  based  on  the  parameters  of  the  Computational  Job.  While  intuitively  a  shorter 
total  response  time  seems  like  it  would  always  be  preferred,  in  some  cases  a  faster  response 
may  not  be  beneficial.  For  example,  in  a  mission  planning  scenario,  the  solution  may  be 
required  by  a  specific  time.  As  long  as  the  solution  is  provided  before  that  time,  it  has  the 
same  utility  to  the  user.  After  that  time,  however,  the  solution  is  not  useful,  as  a  decision 
would have had to be made using an alternative method. As such, defining a utility function,
u(r), is a critical step in analyzing the effect of tradeoffs among system parameters (Fig. 2).

Fig.  2  Utility  step  function


In  the  mission  planning  situation  described  above,  the  utility  function  is  a  step  function. 
That  is,  if  the  response  time  is  within  the  deadline,  then  the  system  has  provided  the  max 
utility.  Otherwise,  the  system  has  provided  0  utility.  This  simple  utility  function  can  be 
written  as 


$$
u(r) =
\begin{cases}
u_{\max}, & r < d \\
0, & \text{otherwise}
\end{cases}
\tag{1}
$$


Of  course,  a  system  designer  should  customize  the  Utility  Function  by  specifying  a  more 
complex  function  of  response  time  (multiple  steps,  piecewise  linear,  exponential,  etc.)  to 
accurately  reflect  the  scenario  at  hand. 
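
As a minimal sketch of the step utility in Eq. 1, the function below returns the maximum utility when the response arrives before the deadline and 0 otherwise. The second function is a hypothetical example of a customized utility (linear decay over an assumed grace period after the deadline); its name and its `grace` parameter are illustrative and do not come from the model itself.

```python
def step_utility(response_time: float, deadline: float, u_max: float) -> float:
    """Step utility of Eq. 1: full utility before the deadline, none afterward."""
    return u_max if response_time < deadline else 0.0


def linear_decay_utility(response_time: float, deadline: float,
                         u_max: float, grace: float) -> float:
    """Hypothetical variant: utility falls linearly to 0 over `grace` seconds
    after the deadline instead of dropping immediately (assumes grace > 0)."""
    if response_time < deadline:
        return u_max
    overshoot = response_time - deadline
    return max(0.0, u_max * (1.0 - overshoot / grace))


if __name__ == "__main__":
    # A response at 5 s against a 10 s deadline earns full utility.
    print(step_utility(5.0, 10.0, 1.0))
    # A response 2 s late with a 4 s grace period earns half utility.
    print(linear_decay_utility(12.0, 10.0, 1.0, 4.0))
```

Keeping the utility as a separate function mirrors the modular design goal: a designer can swap in a different shape without touching the latency calculations.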

4.3  System  Latency


As  described  previously,  the  2  major  components  of  the  model  are  communication  time  and 
computation  time.  For  simplicity  of  exposition,  in  this  work  we  adopt  the  most  naive 
definition of communication latency, l_comm, which we define to be

$$
l_{\mathrm{comm}}(f, h, BW_0, y) = \sum_{i=1}^{h} \frac{f}{BW_{y,i}} = \sum_{i=1}^{h} \frac{f}{BW_{0,i} \cdot 1.5^{y}},
\tag{2}
$$

where f is the file size of the data necessary to transfer, h is the number of hops between the
user and the target computation device, BW_{0,i} is the average bandwidth of the ith hop of that
channel in the current year, BW_{y,i} is the corresponding bandwidth y years from now,
and y is the number of years in the future. This value is a good indicator of the performance
of a current system, but also can help determine the best hardware acquisitions for future
systems. Again, because of the modularity of the model, it is possible for designers to
substitute the most accurate value for communication latency that can be obtained in their
system.
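
The naive communication latency can be sketched as a sum of per-hop transfer times. The code below is an illustration under stated assumptions, not a definitive implementation: the function name, the units (bits and bits per second), and the default annual bandwidth growth factor of 1.5 are assumptions that a designer should replace with values appropriate to their own network.

```python
from typing import Sequence


def comm_latency(file_size_bits: float,
                 hop_bandwidths_bps: Sequence[float],
                 years: float = 0.0,
                 annual_growth: float = 1.5) -> float:
    """Sum over hops of f / (BW_0 * growth**y).

    file_size_bits      -- size f of the data to transfer, in bits
    hop_bandwidths_bps  -- current-year average bandwidth of each hop, in bits/s
    years               -- number of years y into the future
    annual_growth       -- assumed yearly bandwidth multiplier (an assumption)
    """
    scale = annual_growth ** years
    return sum(file_size_bits / (bw * scale) for bw in hop_bandwidths_bps)


if __name__ == "__main__":
    # Hypothetical 1-megabyte file (8e6 bits) sent over two 0.5-Mbps hops.
    print(comm_latency(8e6, [0.5e6, 0.5e6]))           # current year
    print(comm_latency(8e6, [0.5e6, 0.5e6], years=2))  # two years from now
```

With no hops (computation on the user's own device) the sum is empty and the communication latency is 0, which matches the no-offloading case.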

It  is  necessary  to  use  a  slightly  less  naive  model  of  computation  to  observe  realistic  effects. 
We  compute  the  computation  latency  as 

$$
l_{\mathrm{comp}} = \frac{O}{F \cdot S},
\tag{3}
$$


where  O  is  the  total  number  of  operations,  F  is  the  number  of  FLOPS  per  processor,  and, 
from  Amdahl’s  law12,  the  theoretical  speedup  of  multithreaded  execution,  S,  can  be 
computed  as 


$$
S = \frac{1}{(1 - p) + \frac{p}{n_p}}.
\tag{4}
$$


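Here, p is the proportion of the application that is parallelizable and n_p is the number of
processors. To model the total system latency, we add the communication and computation
latencies to obtain

$$
l(f, O, p, h, BW_0, n_p, F_0, y) = l_{\mathrm{comm}}(f, h, BW_0, y) + l_{\mathrm{comp}}(O, p, n_p, F_0, y),
\tag{5}
$$

where F_0 denotes the number of FLOPS per processor in the current year.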

Equation  5  is  the  main  contribution  of  this  work.  It  clearly  lays  out  how  an  offloading  system 
designer  can  substitute  known  values  of  their  hardware  and  applications  to  obtain  a  general 
idea of system performance.
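
The following sketch shows how the pieces of the total latency combine for the current year (y = 0). The function names and the sample job are hypothetical; only the formulas (Amdahl's law for the speedup, operations divided by effective FLOPS for computation, and per-hop transfer time for communication) follow the text.

```python
from typing import Sequence


def amdahl_speedup(p: float, n_p: int) -> float:
    """Theoretical speedup S = 1 / ((1 - p) + p / n_p)."""
    return 1.0 / ((1.0 - p) + p / n_p)


def comp_latency(ops: float, flops_per_proc: float, p: float, n_p: int) -> float:
    """Computation latency O / (F * S)."""
    return ops / (flops_per_proc * amdahl_speedup(p, n_p))


def total_latency(file_size_bits: float, hop_bandwidths_bps: Sequence[float],
                  ops: float, flops_per_proc: float, p: float, n_p: int) -> float:
    """Total latency: communication latency plus computation latency."""
    comm = sum(file_size_bits / bw for bw in hop_bandwidths_bps)
    return comm + comp_latency(ops, flops_per_proc, p, n_p)


if __name__ == "__main__":
    # Hypothetical job: 2e10 operations, fully parallel, 8e6 bits of data,
    # offloaded over one 0.5-Mbps hop to an 8-processor device at 2.5e9 FLOPS each.
    print(total_latency(8e6, [0.5e6], 2e10, 2.5e9, p=1.0, n_p=8))
```

In this hypothetical case the transfer takes 16 s and the computation 1 s, so communication dominates; the same job run locally on one processor would take 8 s with no transfer at all.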

4.4  Usefulness  Over  Time


Figure  3  illustrates  how  the  development  of  technology  over  time  is  captured  by  our  model. 
The thick blue lines represent the development of technologies related to the Network (e.g.,
increase in bandwidth) and Compute Devices (e.g., increase in FLOPS). By repeatedly
applying  these  transformations,  we  can  attempt  to  predict  how  user  utility  will  change  in 
the  future.  The  computation  strategy  that  performs  best  with  today’s  technology  might  not 
be  the  same  computation  mechanism  that  performs  best  with  the  technology  that  will  be 
fielded  5  to  10  years  from  now.  This  helps  designers  identify  which  strategy  is  most 
worthwhile  to  pursue  for  lasting  effectiveness. 


Fig. 3  An illustration of the relationships between the major model components. Moving from left
to right over the dashed vertical line indicates a step 1 year forward in time.

5.  Analysis 

To demonstrate how a system designer can use our model, we illustrate the simplest
computation strategy: not offloading the computational job, but simply executing it on the
user’s handheld device. Since no data is being sent over the network, there is no need to
model  any  network  communication.  The  user’s  handheld  device  contains  a  single  processor, 
so  there  is  no  opportunity  for  parallel  execution.  We  take  Utility  to  be  a  step  function.  The 
deadline  and  max  utility  are  provided  as  model  inputs  that  are  part  of  the  Computational 
Job. 

In the following table, we provide some realistic sample values that we will use to
demonstrate  the  application  of  our  model.  In  Fig.  4,  we  plot  the  communication  and 
computation  latency,  as  well  as  the  derived  total  response  time  as  a  function  of  distance  from 
the  user  using  the  values  in  the  table. 

Table  Estimated values versus distance

Distance From User (m)^a   Hops From User   Number of Processors   Bandwidth (Mbps)^b
10                         1                8                      0.5
100                        2                64                     0.5
1000                       3                512                    40
10000                      4                4096                   40
100000                     5                32768                  200

^a m, meters; ^b Mbps, megabits per second.


Fig.  4  Prediction  of  communication,  computation,  and  total  response  time  vs.  distance  from  user 

In  this  example,  the  user  will  be  executing  a  job  where  the  number  of  operations  needed  to 
complete  the  job  is  provided  as  a  model  input  that  is  part  of  the  Computational  Job. 
According to Moore’s law,13 transistor counts double every 18 months, which corresponds
to an increase of about 60% per year. We take this value for our FLOPS change over time. We
set FLOPS at time 0 (the year 2014) to 2.5 billion (based on the 2.5-GHz clock speed of a
standard smartphone and assuming 1 operation per clock cycle), and arrive at the
expression for the improvement of the user’s computation capability over time as FLOPS(t)
= FLOPS_0 · 1.6^t. With these assumptions, we see in Fig. 5 that by simply waiting 3 years, the
user can obtain considerably more utility without changing any other system parameters.

Fig.  5  Response  time  and  utility  as  a  function  of  years  into  the  future
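
The no-offloading case can be worked through numerically. In the sketch below, the starting value of 2.5e9 FLOPS and the 60% annual growth come from the text; the job size (1e11 operations) and the 15-second deadline are hypothetical values chosen only to illustrate how waiting a few years can move a job from zero utility to full utility under a step utility function.

```python
from typing import Optional

FLOPS_2014 = 2.5e9    # from the text: 2.5-GHz clock, 1 operation per cycle
ANNUAL_GROWTH = 1.6   # from the text: 60% increase per year


def handheld_flops(years: int) -> float:
    """FLOPS(t) = FLOPS_0 * 1.6**t for the user's handheld device."""
    return FLOPS_2014 * ANNUAL_GROWTH ** years


def handheld_response_time(ops: float, years: int) -> float:
    """Single processor, no network: response time is operations / FLOPS(t)."""
    return ops / handheld_flops(years)


def first_year_meeting_deadline(ops: float, deadline: float,
                                horizon: int = 20) -> Optional[int]:
    """Smallest number of years after which the job finishes before the deadline,
    or None if that never happens within `horizon` years."""
    for year in range(horizon + 1):
        if handheld_response_time(ops, year) < deadline:
            return year
    return None


if __name__ == "__main__":
    ops, deadline = 1e11, 15.0  # hypothetical job and deadline
    for year in range(5):
        print(year, round(handheld_response_time(ops, year), 2))
    print(first_year_meeting_deadline(ops, deadline))
```

For these hypothetical inputs the response time starts at 40 s and first drops below the 15-second deadline after 3 years, since 1.6^3 is roughly 4.1.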

Decision  Boundaries 

One of the most powerful uses of our model is deciding
where to offload a computational job. After a system designer specifies all of the model
parameters corresponding to their scenario, a chart such as the one in Fig. 6 can be produced
to give an “at-a-glance” view of system behavior, from which the offloading decision for
any job can be read immediately. To generate these decisions, we choose the
offloading target with the minimum system response time. That is, we find the ideal
target computation device by solving

$$
h^{*} = \arg\min_{h} r(h),
\tag{6}
$$

where  h*  is  the  number  of  hops  from  the  user  where  the  ideal  offload  target  resides,  and  r(h) 
is  the  system  response  time  for  a  device  h  hops  away  from  the  user.  In  Fig.  6,  we  show  the 
decision  boundaries  for  the  system  described  by  our  running  example. 

Fig.  6  Offloading  decision  boundaries

In  Fig.  6  each  cell  corresponds  to  a  computational  job  with  different  attributes.  The  color  of 
the  cell  indicates  the  optimal  offloading  decision  that  should  be  made  for  a  job  with  the 
corresponding  attributes.  The  number  of  serial  operations  is  held  constant  over  all  jobs.  We 
observe clear “bands” of job attributes where offloading to a particular place in the network
is optimal. A system designer can use this type of output to quickly see the effect that
changing some system parameters, such as the number of processors placed at a particular point
in the network, would have on the overall system response time for particular job types.
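
The offloading decision can be sketched as a search over hop counts for the smallest response time. The processor counts and bandwidths below are the sample values from the table; everything else is an assumption made for illustration: each table bandwidth is treated as the bandwidth of the corresponding hop along the path, every processor (including the handheld at hop 0) is assumed to run at 2.5e9 FLOPS, and the example jobs are hypothetical.

```python
HOP_BANDWIDTH_BPS = [0.5e6, 0.5e6, 40e6, 40e6, 200e6]  # table values, hops 1..5
PROCESSORS = [8, 64, 512, 4096, 32768]                 # table values, hops 1..5
FLOPS_PER_PROC = 2.5e9                                 # assumed for every device


def response_time(h: int, file_bits: float, ops: float, p: float) -> float:
    """System response time for a compute device h hops from the user.
    h = 0 means the user's single-processor handheld (no communication)."""
    if h == 0:
        return ops / FLOPS_PER_PROC
    comm = sum(file_bits / bw for bw in HOP_BANDWIDTH_BPS[:h])
    n_p = PROCESSORS[h - 1]
    speedup = 1.0 / ((1.0 - p) + p / n_p)  # Amdahl's law
    return comm + ops / (FLOPS_PER_PROC * speedup)


def best_target(file_bits: float, ops: float, p: float) -> int:
    """Hop count h* that minimizes the system response time."""
    candidates = range(len(PROCESSORS) + 1)
    return min(candidates, key=lambda h: response_time(h, file_bits, ops, p))


if __name__ == "__main__":
    # Small job with a large file: staying on the handheld wins.
    print(best_target(file_bits=8e6, ops=2.5e9, p=0.9))
    # Large, highly parallel job with a tiny file: the farthest device wins.
    print(best_target(file_bits=8e3, ops=2.5e12, p=0.99))
```

Evaluating `best_target` over a grid of file sizes and operation counts, and coloring each cell by the returned hop count, would produce a chart of the same kind as the decision-boundary figure.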

6.  Conclusions  and  Future  Work 


We have presented a simple, modular model of computation on the modern battlefield. We
have  shown  that  a  high-level  model  is  useful  for  studying  a  number  of  important  properties 
of  a  system  designed  to  provide  computational  assistance  to  an  end  user.  By  customizing 
such  a  model  with  the  existing  or  proposed  parameters  of  an  offloading  system,  system 
designers  can  make  quick,  intelligent  decisions  about  where  to  place  resources,  and  how 
to  best  take  advantage  of  them. 

In  future  work,  we  will  compute  error  measurements  as  compared  to  fielded  systems.  We 
will then perform a sensitivity analysis to determine the accuracy of our model, as well as
determine  which  parameters  the  model  is  most  sensitive  to,  informing  us  as  to  which  parts 
of  the  model  should  be  improved  first. 